Analyzing lexical change in diachronic corpora
Similar resources
Diachronic Stylistic Changes in British and American Varieties of 20th Century Written English Language
In this paper we present the results of a study investigating diachronic changes in four stylistic features of 20th-century written English: average sentence length, Automated Readability Index, lexical density and lexical richness. All experiments were conducted on the largest existing diachronic corpora of British and American English – the Brown ‘family’ corpora, employing NLP t...
Using Comparable Corpora to Track Diachronic and Synchronic Changes in Lexical Density and Lexical Richness
This study in language variation and change draws on comparable diachronic and synchronic corpora of 20th-century British and American English (the ‘Brown family’ of corpora). We investigate recent changes in lexical density and lexical richness over two consecutive thirty-year periods in British English (1931–1961 and 1961–1991) and in 1961–1992 in...
From semi-automatic to automatic affix extraction in Middle English corpora: Building a sustainable database for analyzing derivational morphology over time
The annotation of large corpora is usually restricted to syntactic structure and word class. Pure lexical information and information on the structure of words are stored in specialized dictionaries (Baayen et al., 1995). Both data structures – dictionary and text corpus – can be matched to get e.g. a distribution of certain (restricted) lexical information from a text. This procedure works fin...
The Diachronic Change of German Nominalization Patterns: An Increase in Prototypicality
This paper aims at accounting for the emergence and loss of constraints governing the formation of deverbal nominalizations in German from a cognitive point of view. Specifically, diachronic changes in the formation of derivatives in the suffix -ung are investigated on the basis of two large corpora of Middle High German (MHG, 1050-1350) and Early New High German (ENHG, 1350-1650) texts, respec...
Multiple Tokenizations in a Diachronic Corpus
This paper deals with the construction of a maximally flexible corpus architecture for building and analyzing diachronic corpora. Historical data poses many challenges with regard to representation and analysis, and diachronic corpora are even more varied and unsystematic (Claridge, 2008). Since historical and diachronic corpora are so difficult and expensive to build, it is crucial that they b...